Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove use of distinct in event listing #649

Merged
merged 1 commit into from
Aug 8, 2023

Conversation

voneiden
Copy link
Contributor

@voneiden voneiden commented Aug 3, 2023

The use of distinct is extremely slow in the event listing. The need for distinct can be replaced in one-to-many and many-to-many relations by using
exclude(~Q(relation=something). When excluding relations, Django automatically uses SQL EXISTS. Negating the Q statement will essentially create a NOT NOT EXISTS which the database then optimizes into plain EXISTS.

DISTINCT can be avoided by utilizing EXISTS queries in filters that can create duplicate rows. Additionally some filter queries on m2m relations can be made still significantly faster if we target the through table directly to avoid unnecessary INNER JOIN by django.

@voneiden voneiden force-pushed the link-1408-optimize-distinct branch 2 times, most recently from 1c27f6a to 19b0627 Compare August 3, 2023 13:17
@codecov-commenter
Copy link

codecov-commenter commented Aug 3, 2023

Codecov Report

Merging #649 (48c5714) into develop (fa1bcf9) will increase coverage by 0.60%.
Report is 3 commits behind head on develop.
The diff coverage is 96.82%.

❗ Current head 48c5714 differs from pull request most recent head 3324685. Consider uploading reports for the commit 3324685 to get more accurate results

@@             Coverage Diff             @@
##           develop     #649      +/-   ##
===========================================
+ Coverage    75.45%   76.06%   +0.60%     
===========================================
  Files          237      239       +2     
  Lines        17297    17357      +60     
===========================================
+ Hits         13052    13203     +151     
+ Misses        4245     4154      -91     
Files Changed Coverage Δ
events/api.py 87.77% <96.15%> (+0.24%) ⬆️
events/tests/test_keyword.py 100.00% <100.00%> (ø)

... and 4 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@voneiden voneiden force-pushed the link-1408-optimize-distinct branch from 19b0627 to 56bee3e Compare August 3, 2023 18:10
@voneiden
Copy link
Contributor Author

voneiden commented Aug 3, 2023

Yritetty tarkistaa seuraavat relaatiot

Events many-to-many

in_language
images
keywords
audience

Events one-to-many

videos
external_links
offers
aliases
sub_events

@voneiden voneiden force-pushed the link-1408-optimize-distinct branch from 56bee3e to a00fc95 Compare August 4, 2023 06:36
@voneiden voneiden marked this pull request as ready for review August 4, 2023 07:16
@voneiden voneiden requested review from charn and danipran August 4, 2023 07:16
events/api.py Outdated Show resolved Hide resolved
events/api.py Outdated Show resolved Hide resolved
events/api.py Outdated Show resolved Hide resolved
Copy link
Contributor

@danipran danipran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The negated excludes are definitely not intuitive and hurt readability, but I'd say the benefits outweigh the cons. LGTM 👍

@voneiden voneiden force-pushed the link-1408-optimize-distinct branch 3 times, most recently from d867d2c to e47cf32 Compare August 7, 2023 12:17
events/api.py Outdated Show resolved Hide resolved
events/api.py Outdated Show resolved Hide resolved
Copy link
Contributor

@charn charn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, this looks awesome! 🍪 :shipit:

Copy link
Contributor

@tuomas777 tuomas777 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍 🌞

The use of distinct is extremely slow. It's use can be avoided in
one-to-many and many-to-many relations by using
exclude(~Q(relation=something). When excluding relations, Django
automatically uses SQL EXISTS. Negating the Q statement will
essentially create a NOT NOT EXISTS which the database then optimizes
into plain EXISTS.

This appears to be still notably faster than using
filter(Exists(subquery)) which creates an INNER JOIN inside the exists
clause.

Ref LINK-1408
@voneiden voneiden force-pushed the link-1408-optimize-distinct branch from 48c5714 to 3324685 Compare August 8, 2023 09:42
@sonarcloud
Copy link

sonarcloud bot commented Aug 8, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

98.1% 98.1% Coverage
0.0% 0.0% Duplication

@voneiden voneiden merged commit c1d128a into develop Aug 8, 2023
3 checks passed
@voneiden voneiden deleted the link-1408-optimize-distinct branch August 8, 2023 09:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants